Classification and regression trees

Author

  • Wei-Yin Loh
Abstract

Classification and regression trees are machine-learning methods for constructing prediction models from data. The models are obtained by recursively partitioning the data space and fitting a simple prediction model within each partition. As a result, the partitioning can be represented graphically as a decision tree. Classification trees are designed for dependent variables that take a finite number of unordered values, with prediction error measured in terms of misclassification cost. Regression trees are for dependent variables that take continuous or ordered discrete values, with prediction error typically measured by the squared difference between the observed and predicted values. This article gives an introduction to the subject by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples. © 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011 1 14–23 DOI: 10.1002/widm.8
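
The recursive partitioning described in the abstract can be illustrated with a minimal sketch of a regression tree on a single predictor. This is not any of the algorithms reviewed in the article; it assumes an exhaustive search over midpoints between sorted predictor values, a constant (mean) prediction in each partition, and squared error as the split criterion. The function names (`sse`, `best_split`, `grow`, `predict`) and the stopping parameters `max_depth` and `min_size` are hypothetical choices made for this illustration.

```python
from statistics import mean


def sse(ys):
    """Sum of squared differences between each value and the node mean."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(xs, ys):
    """Return (threshold, total squared error) of the best split, or None."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


def grow(xs, ys, depth=0, max_depth=2, min_size=2):
    """Recursively partition the data; each leaf predicts the mean response."""
    split = None
    if depth < max_depth and len(ys) >= min_size:
        split = best_split(xs, ys)
    # Stop when no split exists or the split does not reduce squared error.
    if split is None or split[1] >= sse(ys):
        return {"predict": mean(ys)}
    t = split[0]
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    return {
        "threshold": t,
        "left": grow([x for x, _ in left], [y for _, y in left],
                     depth + 1, max_depth, min_size),
        "right": grow([x for x, _ in right], [y for _, y in right],
                      depth + 1, max_depth, min_size),
    }


def predict(tree, x):
    """Follow the splits from the root down to a leaf."""
    while "threshold" in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["predict"]


# Made-up toy data with two obvious clusters of x values.
xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
tree = grow(xs, ys, max_depth=1)
print(tree["threshold"], predict(tree, 2), predict(tree, 11))
```

A classification tree would follow the same recursion but replace the squared-error criterion with a measure based on misclassification cost or node impurity, and predict a class label in each leaf.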

Related resources

Predicting The Type of Malaria Using Classification and Regression Decision Trees

Maryam Ashoori (1, *), Fatemeh Hamzavi (2). (1) School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran. (2) School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran. Abstract. Background: Malaria is an infectious disease infecting 200-300 million people annually. Environme...

Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis

Background: Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses so that they can select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predi...

Assessing Behavioral Patterns of Motorcyclists Based on Traffic Control Device at City Intersections by Classification Tree Algorithm

According to forensic statistics, 26 percent of those killed in traffic accidents in Iran in recent years were motorcyclists. Given this high number of casualties, it is necessary to investigate the causes of motorcycle accidents. Motorcyclists' dangerous behaviors are among the causes discussed in this study. Traffic signs have the important role of...

Prediction of melting points of a diverse chemical set using fuzzy regression tree

Classification and regression trees (CART) have the advantage of being able to handle large data sets and yield readily interpretable models. In spite of these advantages, they are also recognized as highly unstable classifiers with respect to minor perturbations in the training data. In other words, the methods exhibit high variance. Fuzzy logic brings in an improvement in these aspects due ...

Comparing Different Modeling Techniques for Predicting Presence-absence of Some Dominant Plant Species in Mountain Rangelands, Mazandaran Province

In applied studies, the investigation of the relationship between a plant species and environmental variables is essential to manage ecological problems and rangeland ecosystems. This research was conducted in summer 2016. The aim of this study was to compare the predictive power of a number of Species Distribution Models (SDMs) and to evaluate the importance of a range of environmental variabl...

Application of classification trees-J48 to model the presence of roach (Rutilus rutilus) in rivers

In the present study, classification trees (CTs-J48 algorithm) were used to study the occurrence of roach in rivers in Flanders (Belgium). The presence/absence of roach was modelled based on a set of river characteristics. The predictive performance of the CTs models was assessed based on the percentage of Correctly Classified Instances (CCI) and Cohen's kappa statistics. To find the best model...

Journal title:
  • Wiley Interdisc. Rev.: Data Mining and Knowledge Discovery

Volume 1

Pages 14–23

Publication date 2011